Method for Determining Scattered Disparity Fields in Stereo Vision

ABSTRACT

In a system for stereo vision including two cameras shooting the same scene, a method is performed for determining scattered disparity fields when the epipolar geometry is known, which includes the steps of: capturing, through the two cameras, first and second images of the scene from two different positions; selecting at least one pixel in the first image, the pixel being associated with a point of the scene and the second image containing a point also associated with the above point of the scene; and computing the displacement from the pixel to the point in the second image by minimising a cost function, such cost function including a term which depends on the difference between the first and the second image and a term which depends on the distance of the above point in the second image from an epipolar straight line, followed by a check whether the point belongs to an allowability area around a subset of the epipolar straight line in which the presence of the point is allowed, in order to take into account errors or uncertainties in calibrating the cameras.

TECHNICAL FIELD

The present invention deals, in general, with a method for automatically analysing images for stereo vision, and in particular with a method for determining scattered disparity fields when the epipolar geometry is known.

A frequent problem to be solved in automatically analysing images is determining the disparity existing between two frames showing the same scene, acquired from different points of view with one or more cameras. Disparity is related to the positions, in the two frames, of the same scene element. There are algorithms computing the disparity for every image pixel, as well as others which are limited to a subset of pixels, namely a group of points. The former case deals with dense disparity fields, the latter case deals with scattered disparity fields. The present invention pertains to the second category.

A widespread technique for determining the scattered disparity field existing between two frames consists in identifying and associating related relevant points, namely points describing characteristic image areas. There are several algorithms for such purpose, which are based on identifying corners, edges, outlines or any other feature which can be plausibly associated between frames, namely reducing the ambiguity. Such algorithms have been made for tracking points between generic frames, also taken by a single camera at different time instants.

In the “computer vision” field, there are numerous documents dealing with the problem of the search for correspondence between different frames. In particular, in stereo vision, one of the main objectives is determining the distance of portrayed objects starting from n different views, with n≥2. In case of n=2, for determining the correspondence, the so-called fundamental matrix is used, which describes the relationship existing between two images related to the same scene, taking into account the parameters describing the vision system (namely camera features and positions). Such relationship implies that, given a point on the first image (frame), its corresponding point on the second image (frame) will lie on a straight line, called epipolar line.

One of the best known algorithms which can be used for computing a scattered disparity field, based on identifying and tracking features being present in images, is the one described in the article by Carlo Tomasi and Takeo Kanade “Shape and Motion from Image Stream - Part 3 - Detection and Tracking of Point Features”, Technical Report CMU-CS-91-132, April 1991, and known in the art as KLT (Kanade-Lucas-Tomasi) technique. As such article states, there are two important problems to be solved for a correct point tracking between frames: how to select the features to be tracked (“feature selection”) and how to track them frame by frame (“tracking”). Such article describes in particular an algorithm for performing the tracking, through a 2×2 linear system (which can be translated into a cost function to be minimised) whose unknown is the two-dimensional displacement vector of the point feature between two frames (disparity vector). The KLT technique, although devised for sequences of images, can be efficiently used also in case of simultaneous images of the same scene.

The article by J. Shi and C. Tomasi, “Good Features to Track”, IEEE Conference on Computer Vision and Pattern Recognition (CVPR94), Seattle, June 1994, being based on the KLT technique, proposes a selection criterion of features on which tracking can be done.

The article by J. Mendelsohn, E. Simoncelli, and R. Bajcsy, “Discrete-Time Rigidity-Constrained Optical Flow”, 7th International Conference on Computer Analysis of Images and Patterns, Sep. 10-12, 1997, deals with the computation of dense fields and imposes that disparity vectors (which constitute dense fields) point to positions lying on epipolar straight lines. With this constraint, it is possible to improve the results that can be obtained, if the epipolar straight lines are accurately known.

The article by H. Najafi and G. Klinker, “Model-based Tracking with Stereovision for AR”, Proceedings of the Second IEEE and ACM International Symposium on Mixed and Augmented Reality (ISMAR), 2003, proposes to use the fundamental matrix downstream of the KLT technique to remove those features for which corresponding points in the second frame do not rest on epipolar straight lines. Consequently, the disparity vectors which are in conflict with the configuration of the two cameras are removed.

BRIEF DESCRIPTION OF THE INVENTION

The Applicant observed that the KLT technique does not exploit knowledge of the fundamental matrix, which is often available, above all in case of two or more cameras.

Moreover, the Applicant observed that the disparity computation techniques which exploit the fundamental matrix constrain the points to stay on the epipolar lines and that such constraint brings about correct results only assuming an accurate camera calibration. However, the calibration is typically obtained through estimations and is therefore subject to errors or inaccuracies.

The Applicant found that, by suitably modifying the KLT method in order to take into account information pertaining to the fundamental matrix, which can be computed through a calibration process, it is possible to improve the quality of the extracted disparity field. In particular, the Applicant found that, by modifying the point tracking process in order to take into account the fundamental matrix, but without setting absolute constraints, it is possible to affect at will the process, in order to compensate for errors in calibrating the system.

In particular, the Applicant found that, by allowing the solution to deviate from the epipolar straight line, through the introduction, in the KLT technique cost function, of a term proportional to the square of the distance between the epipolar line and the searched point, it is possible to obtain accurate results also when the calibration is approximate. The results thereby obtained can be better both than those which can be obtained by applying the original KLT technique, and than those which can be obtained by constraining the identified points to rest only on the epipolar straight lines.

Preferably, in order to determine the point of the second image associated with the pixel of the first image, the cost function minimising step is iterated several times, using every time the update of the cost function obtained in the previous cycle.

Moreover, the Applicant found that it is possible to further limit the validity area with respect to the epipolar straight line. Such restriction is represented by a half line (subset of the epipolar straight line), which brings about the elimination of all disparity vectors which do not point to positions near such half line.

The preferred field of application of the present invention is computer vision, in particular in case of shooting from many points of view.

BRIEF DESCRIPTION OF THE FIGURES

The present invention will be described below with reference to the enclosed figures, which show a non-limiting application example thereof. In particular:

FIG. 1 schematically shows a system for video shooting comprising two cameras and blocks composing the process described in the invention;

FIG. 2 schematically shows images taken by the two cameras, pointing out on one of them the epipolar line associated with a pixel of the other image;

FIG. 3 shows the above image pointing out the above epipolar line and an allowability area for disparity vectors;

FIGS. 4 and 5 show a pair of frames of the same scene taken from different angles, for a comparative test between the inventive technique and the KLT technique carried out in the original way;

FIG. 6 shows the result of the KLT tracking technique carried out in the original way;

FIG. 7 shows the result obtained by means of the technique of the present invention; and

FIGS. 8 and 9 show comparative details of the results shown in FIGS. 6 and 7.

DETAILED DESCRIPTION OF THE INVENTION

With reference to FIG. 1, 100 designates as a whole a system for video shooting with many cameras. In particular, the system 100 is capable of shooting the same scene from different positions and of determining the scattered disparity fields between the images thereby shot.

The system 100 comprises a first camera T₁ and a second camera T₂, both capable of generating respective digital images (or frames) IM₁, IM₂ of the same scene. The two cameras T₁, T₂ can be oriented towards the scene as mutually parallel or mutually slanted.

The system 100 further comprises a processing unit PU, connected to both cameras T₁, T₂, and capable of processing digital images IM₁ and IM₂ received from the two cameras T₁, T₂ to obtain a scattered disparity field according to the method described below. Moreover, the system 100 comprises a camera calibration module CM, preferably a software module being present in the processing unit, or in a separate unit, capable of supplying the processing unit with calibration data, to be used for the processing according to the inventive method.

The relationship existing between two simultaneous images of the same scene is similar to the relationship existing in a sequence of consecutive images taken by a single camera. The problem of determining the displacement of image features can therefore be solved referring to the KLT tracking technique, with a choice of features as suggested by Shi and Tomasi in the previously mentioned article.

A sequence of images can be represented with a function F(x,y,t) in which F designates the intensity (a scalar in the monochrome case), x and y are the space coordinates (position in the frame) and t is the time. A common approximation, on which the KLT technique is also based, expresses the image variation between times t and t+τ only as a space distortion

F(x,y,t)=F(x+u(x,y),y+v(x,y),t+τ)  (1)

where u(x,y) and v(x,y) are the displacement amounts in the two space directions. Formula (1) is known in literature as “brightness change constraint equation” (BCCE).

This means that an image at time t+τ can be obtained by moving every image point at time t by a suitable amount, called displacement. By knowing the displacement of every pixel in an image, it is possible to build the disparity field of the image itself, namely a set of vectors which, applied to the relevant frame, allow building the frame at the other time.

In practice, the BCCE is not exactly observed. Suffice it to think about objects partially hidden in one of the images: such objects do not “move” from one frame to the following, but simply appear or disappear. The same problem can be found on the image edges, in which objects go in or out of the scene. In any case, the KLT technique takes into account some features and not all pixels, and for the features it is possible to obtain an estimation of the BCCE correctness downstream of the displacement computation.

The technique of the present invention computes the disparity field existing between two (or more) frames independently for different features belonging to the scene, which therefore produces a scattered field.

Briefly, as shown in FIG. 1, the method of the present invention comprises the following steps, which will be described in more detail below:

- calibrating the cameras T₁, T₂, by computing intrinsic parameters (such as focal length and projection centre) and extrinsic parameters (roto-translation matrix) of the same (block 110);
- extracting features of the image IM₁ (block 120);
- tracking features on the second image IM₂, taking into account the calibration parameters (block 130); and
- removing from computation the points which do not fall within allowability areas in the second image IM₂, defined below (block 140).

The removing step must be deemed as preferred but not mandatory in the method of the present invention. The method outputs are the matches between features of the first image IM₁ and corresponding points of the second image IM₂.

As previously stated, the problem of computing the scattered disparity field can be divided into two parts: how to choose objects to be tracked, namely image features, and how to track chosen objects. The present invention affects the tracking operation.

The above function F for the two time instants t and t+τ can be expressed as two functions I and J in the following way:

I(x)=F(x,y,t)  (2)

and

J(x+d)=F(x+u,y+v,t+τ)  (3)

where d is a displacement vector associated with the pixel with coordinates x=(x,y) and containing the two components u and v. By inserting this replacement, it is possible to rewrite the BCCE expressed by equation (1) in the following way:

I(x)=J(x+d).  (4)

Since the method is applied independently on every selected point, it is possible to take into account a window of pixels W, centered on the object whose movement has to be computed (namely on pixel x). Equation (4), in window W, is valid apart from a residual error which can be expressed as:

$\varepsilon = \int_{W} \left[ I(x) - J(x + d) \right]^{2} \, dx \qquad (5)$

The optimum displacement vector d is obtained by minimising the error in equation (5). Equation (5) is not linear in d, and to solve it, it is preferable to transform it into a linear one. This is possible when the displacement vector d is small, by using the Taylor approximation truncated at its first order:

J(x+d)=J(x)+g ^(T) d  (6)

where g is the gradient of function J evaluated in x=(x,y),

g(x)=∇J(x).  (7)

By making equation (6) discrete for all pixels falling within window W, the following system of equations is obtained:

$\left\{ \begin{matrix} J_{x}(p_{1})\,u + J_{y}(p_{1})\,v = I(p_{1}) - J(p_{1}) \\ \vdots \\ J_{x}(p_{N})\,u + J_{y}(p_{N})\,v = I(p_{N}) - J(p_{N}) \end{matrix} \right. \qquad (8)$

where J_(x)(p_(n)) and J_(y)(p_(n)) are the two elements of the gradient vector in the generic pixel p_(n) and N is the number of pixels contained in window W.

The system (8) can be expressed in matrix form in the following way:

$\begin{bmatrix} J_{x}(p_{1}) & J_{y}(p_{1}) \\ \vdots & \vdots \\ J_{x}(p_{N}) & J_{y}(p_{N}) \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} I(p_{1}) - J(p_{1}) \\ \vdots \\ I(p_{N}) - J(p_{N}) \end{bmatrix} \qquad (9)$

Equation (9) can be rewritten in the compact form Ad=b. Matrix A is composed of brightness gradients of the second frame in pixels which fall within the applied window. Vector b is composed of brightness differences between pixels of the two frames in the same window. Vector d is the displacement to be computed. It can be appreciated that the displacement is not necessarily towards a pixel (having therefore discrete coordinates) but more generically towards a point of the second image, which could also be intermediate between two adjacent pixels.

It can be advantageous to assign a relative importance to every pixel belonging to a window, by multiplying each equation by a weight, in the following way:

$VAd = Vb, \quad \text{where} \quad V = \begin{bmatrix} w_{1} & & \\ & \ddots & \\ & & w_{N} \end{bmatrix} \qquad (10)$

The weight assigned to the generic pixel p_(n) is therefore equal to w_(n). The weights are used to give greater importance to pixels being present at the window centre.

The system is over-determined, since it is composed of N equations (N>2) and two unknowns, and can be solved through normal equations representing the least squares approximation:

A^(T)WAd=A^(T)Wb

Gd=e  (11)

in which

- W=V^(T)V,
- G=A^(T)WA,
- e=A^(T)Wb.
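Purely as a non-limiting illustration, the weighted normal equations (11) can be sketched in Python with NumPy as follows. The function name `klt_step`, the synthetic gradients and the Gaussian-shaped weights are assumptions made for this example only and are not taken from the present description.

```python
import numpy as np

def klt_step(grad_x, grad_y, diff, weights):
    """Solve G d = e (equation 11) for a single window.

    grad_x, grad_y: J_x(p_n), J_y(p_n) for the N pixels of the window.
    diff:           I(p_n) - J(p_n) for the same pixels.
    weights:        diagonal entries of W = V^T V, one per pixel.
    """
    A = np.column_stack([grad_x, grad_y])   # N x 2 matrix of equation (9)
    b = np.asarray(diff, dtype=float)       # N-vector of equation (9)
    W = np.diag(weights)                    # N x N diagonal weight matrix
    G = A.T @ W @ A                         # 2 x 2 matrix
    e = A.T @ W @ b                         # 2-vector
    return np.linalg.solve(G, e), G, e

# Synthetic 5 x 5 window (illustrative data, not from the description).
rng = np.random.default_rng(0)
gx, gy = rng.normal(size=25), rng.normal(size=25)
d_true = np.array([0.3, -0.2])
diff = gx * d_true[0] + gy * d_true[1]      # exact linear model, no noise
yy, xx = np.mgrid[-2:3, -2:3]
w = np.exp(-(xx**2 + yy**2) / 4.0).ravel()  # hypothetical weights, largest at the centre
d, G, e = klt_step(gx, gy, diff, w)
print(d)
```

Since the synthetic brightness differences obey the linear model exactly, the recovered vector coincides with the displacement used to generate them, whatever positive weights are chosen.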

The KLT technique further comprises a criterion for choosing points to be tracked. Such choice can be made with a known technique, for example according to the teaching of Shi-Tomasi. The chosen criterion must in any case guarantee the invertibility of matrix G.

All previously stated remarks are also valid for a pair of frames showing the same scene and which are simultaneously taken, like frames IM₁ and IM₂ in FIG. 1. In this case, in fact, the two functions I and J are associated with the first image IM₁ and the second image IM₂, respectively. A necessary condition for the two frames IM₁, IM₂ to satisfy equation (1) is that the two cameras T₁, T₂ are placed near enough to each other and that they shoot the same scene portion.

The relationship between two shootings of the same scene is expressed through the epipolar geometry. Every element in the scene which is visible in both shootings is projected in the two frames IM₁, IM₂ in positions which comply with a simple equation. The coefficients of such equation are contained in the fundamental matrix F, composed of 3×3 elements. The positions of projections in the two frames, specified in homogeneous coordinates, are

$m_{1} = \begin{bmatrix} x_{1} \\ y_{1} \\ 1 \end{bmatrix}$ and $m_{2} = \begin{bmatrix} x_{2} \\ y_{2} \\ 1 \end{bmatrix},$

in which x₁, y₁, x₂, y₂ are the coordinates of the two pixels in the two frames showing the shot element. The coordinates must comply with the equation

m₂ ^(T)Fm₁=0.  (12)

Therefore, given a point m₁ in the first one of the two frames, point m₂ in the second frame must reside on a straight line whose coefficients are determined by product Fm₁. Symmetrically, given a point m₂ in the second frame, point m₁ resides on the straight line with coefficients contained in product m₂ ^(T)F. These straight lines are called epipolar. These relationships compose the epipolar geometry, which is completely represented by the fundamental matrix F.
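As a non-limiting illustration of equation (12), the following Python sketch computes the coefficients of the epipolar line Fm₁ and checks that a matching point m₂ lies on it. The fundamental matrix used here is a hypothetical one, corresponding to two identical cameras displaced by a pure translation along the x axis; the pixel coordinates are likewise invented for the example.

```python
import numpy as np

def epipolar_line(F, m1):
    """Coefficients (a, b, c) of the line F m1 on which m2 must lie."""
    return F @ m1

# Hypothetical geometry: pure translation along x, so that F is
# proportional to the skew-symmetric matrix of the vector (1, 0, 0).
F = np.array([[0.0, 0.0,  0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0,  0.0]])

m1 = np.array([120.0, 80.0, 1.0])   # point in the first frame (homogeneous)
m2 = np.array([95.0, 80.0, 1.0])    # same row, shifted horizontally

line = epipolar_line(F, m1)         # line -y + 80 = 0, namely y = 80
residual = m2 @ F @ m1              # left-hand side of equation (12)
print(line, residual)
```

In this particular geometry every epipolar line is a horizontal image row, which is why the two points share the same y coordinate.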

In the present specification, both intrinsic and extrinsic camera calibration parameters, which determine the fundamental matrix, are supposed to be known. However, the Applicant observed that, since calibration can be affected by estimation errors, it can be incorrect to use information provided thereby as an absolute constraint.

In order to incorporate information deriving from the knowledge of the calibration parameters, it is useful to interpret the KLT technique as a minimisation of a certain function obtained from equation (9). This equation is over-determined and can be solved by a least squares approximation; its solution corresponds to finding the minimum of the norm of the so-called residue, given by Ad−b. It is therefore possible to re-state the previously described problem in the following way:

d=argmin∥Ad−b∥²,  (13)

namely as a cost function to be minimised. Taking into account the weights introduced in equation (10), equation (13) becomes

d=argmin∥Gd−e∥².  (14)

In practice, the vector thereby obtained is not always correct, namely it does not always correspond to the displacement of objects from one frame to the other. For example, this can occur when the linear approximation, on which the KLT algorithm is based, is not valid.

In case of two cameras, it is possible to take into account information related to calibration, and therefore to improve the results. The epipolar geometry provides indications about the area in the second frame in which the feature has to be searched for computing the displacement vector d.

The present invention exploits calibration data by translating them into a constraint to be added to those already contained in equation (14). The constraint is translated into a cost term linked to the epipolar information. If the geometry is exactly known, the searched point should lie on the epipolar straight line. Taking into account that calibration is obtained through estimations, and is therefore subject to errors or inaccuracies, it is more appropriate to assign a weight to the calibration information. Consequently, it is necessary to allow the solution to deviate from the epipolar straight line, and this is obtained in the present invention by introducing a term in the cost function (14), proportional to the square of the distance between the epipolar line and the searched point.

Equation (14) therefore assumes the form

d=argmin{∥Gd−e∥²+λ²ρ(z+d,L)²},  (15)

where ρ is a function expressing the Euclidean distance between point z+d, namely point z=[x y]^(T) translated by amount d=[u v]^(T), and L, namely the epipolar straight line associated with point z. The λ factor must be established beforehand and can be determined heuristically, also depending on the reliability assigned to the fundamental matrix. The higher the value of λ, the more the fundamental matrix correctness is relied upon. λ can be increased until it constrains de facto the solution to lie on the epipolar straight line.

The epipolar straight line is obtained through the fundamental matrix F. Within the second frame, the distance of the straight line from point z+d can be expressed as

$\rho(z + d, L) = \frac{a(x + u) + b(y + v) + c}{\sqrt{a^{2} + b^{2}}} \qquad (16)$

where coefficients a, b and c are given by product

$\begin{bmatrix} a \\ b \\ c \end{bmatrix} = Fz \qquad (17)$

so that any point m=[x_(m)y_(m)1]^(T) on the line complies with equation

ax _(m) +by _(m) +c=0.
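By way of a hedged illustration, equations (16) and (17) can be sketched as follows. The function name `epipolar_distance` and the numerical values are assumptions of this example; it is also assumed that point z is written in homogeneous coordinates, [x y 1]^(T), when forming product Fz, consistently with equation (12). As in equation (16), the value returned is signed.

```python
import numpy as np

def epipolar_distance(F, z, d):
    """Signed distance rho(z + d, L) of equation (16)."""
    # Equation (17), with z expressed in homogeneous coordinates (assumption).
    a, b, c = F @ np.array([z[0], z[1], 1.0])
    x, y = z[0] + d[0], z[1] + d[1]
    return (a * x + b * y + c) / np.hypot(a, b)

# Hypothetical fundamental matrix (pure translation along x).
F = np.array([[0.0, 0.0,  0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0,  0.0]])
z = (120.0, 80.0)

on_line = epipolar_distance(F, z, (-25.0, 0.0))   # displacement along the line
off_line = epipolar_distance(F, z, (-25.0, 3.0))  # three pixels away from the line
print(on_line, off_line)
```

Only the square of this quantity enters the cost function (15), so the sign of the distance has no influence on the minimisation.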

The solution to equation (15) can be obtained through normal equations

$\begin{bmatrix} G \\ \lambda p^{T} \end{bmatrix}^{T} \begin{bmatrix} G \\ \lambda p^{T} \end{bmatrix} d = \begin{bmatrix} G \\ \lambda p^{T} \end{bmatrix}^{T} \begin{bmatrix} e \\ \lambda r \end{bmatrix} \qquad (18)$

where vectors p and r are defined as

$p = \frac{1}{\sqrt{a^{2} + b^{2}}} \begin{bmatrix} a \\ b \end{bmatrix} \qquad (19)$

and

$r = \frac{-(ax + by + c)}{\sqrt{a^{2} + b^{2}}} \qquad (20)$
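With the definitions (19) and (20), the distance of equation (16) can be written as ρ(z+d,L)=p^(T)d−r, so that the cost function (15) is an ordinary least squares problem obtained by stacking row λp^(T) under matrix G and value λr under vector e. A minimal Python sketch of the resulting normal equations (18) is given below; the function name and the numerical values of G, e, the line and λ are illustrative assumptions only.

```python
import numpy as np

def solve_with_epipolar_term(G, e, line, z, lam):
    """Solve the normal equations (18) for the displacement d."""
    a, b, c = line
    norm = np.hypot(a, b)
    p = np.array([a, b]) / norm                  # equation (19)
    r = -(a * z[0] + b * z[1] + c) / norm        # equation (20)
    M = np.vstack([G, lam * p])                  # 3 x 2 stacked matrix [G; lam p^T]
    rhs = np.append(e, lam * r)                  # 3-vector [e; lam r]
    return np.linalg.solve(M.T @ M, M.T @ rhs)   # equation (18)

# Illustrative data: G d = e alone would give d = (1, 2), while the
# epipolar line y = 80 through z = (120, 80) asks for v = 0.
G = np.array([[2.0, 0.5],
              [0.5, 1.0]])
e = G @ np.array([1.0, 2.0])
line = (0.0, 1.0, -80.0)
z = (120.0, 80.0)

d_free = solve_with_epipolar_term(G, e, line, z, lam=0.0)   # original KLT solution
d_soft = solve_with_epipolar_term(G, e, line, z, lam=1.0)   # compromise solution
d_hard = solve_with_epipolar_term(G, e, line, z, lam=1e3)   # practically on the line
print(d_free, d_soft, d_hard)
```

The three calls show the role of λ described above: with λ=0 the original solution is recovered, an intermediate value yields a compromise, and a large value de facto forces the point onto the epipolar straight line.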

Equations (11) and (14) can be extended to obtain a more accurate result through an iterative computation which provides, at every step, an update of the Taylor series. In this case, the disparity vector is composed of many terms. At every step, a term is added, and the final solution can be expressed as

$d = \sum_{m=1}^{K} d_{m} \qquad (21)$

where every term is a vector

$d_{m} = \begin{bmatrix} u_{m} \\ v_{m} \end{bmatrix} \qquad (22)$

Equation (9) in step number M becomes

$\begin{bmatrix} J_{x}\left(p_{1} + \sum_{m=1}^{M-1} d_{m}\right) & J_{y}\left(p_{1} + \sum_{m=1}^{M-1} d_{m}\right) \\ \vdots & \vdots \\ J_{x}\left(p_{N} + \sum_{m=1}^{M-1} d_{m}\right) & J_{y}\left(p_{N} + \sum_{m=1}^{M-1} d_{m}\right) \end{bmatrix} \begin{bmatrix} u_{M} \\ v_{M} \end{bmatrix} = \begin{bmatrix} I(p_{1}) - J\left(p_{1} + \sum_{m=1}^{M-1} d_{m}\right) \\ \vdots \\ I(p_{N}) - J\left(p_{N} + \sum_{m=1}^{M-1} d_{m}\right) \end{bmatrix} \qquad (23)$

or, more compactly

A_(M-1)d_(M)=b_(M-1).  (24)

Consequently, equation (11) becomes, at iteration number M

A_(M-1) ^(T)WA_(M-1)d_(M)=A_(M-1) ^(T)Wb_(M-1)

G_(M-1)d_(M)=e_(M-1)  (25)

The iteration is initialised at M=1, with d₀=0. The total number of iterations can be set beforehand or determined depending on the norm of the last added term.

Normal equations (18) providing the solution to equation (15) can be adapted to the above iterative process, through a modification to vector r, described in equation (20). Such vector will assume the form

$r = \frac{-\left( a\left(x + \sum_{m=1}^{M-1} u_{m}\right) + b\left(y + \sum_{m=1}^{M-1} v_{m}\right) + c \right)}{\sqrt{a^{2} + b^{2}}} \qquad (26)$

at iteration number M.
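The iterative process of equations (21) to (26) can be sketched, under strong simplifying assumptions, as follows. The two frames are replaced by an analytic Gaussian blob, so that no pixel interpolation is needed; the weights are all set to one; the blob width, the window size, the stopping tolerance and the function names are hypothetical choices made for this illustration and do not come from the present description.

```python
import numpy as np

SIGMA = 3.0  # width of the synthetic blob (assumption)

def blob(points, centre):
    """Synthetic brightness and its gradient at the given points."""
    diff = points - centre
    val = np.exp(-np.sum(diff**2, axis=1) / (2.0 * SIGMA**2))
    grad = -diff / SIGMA**2 * val[:, None]
    return val, grad

def track(z, line, lam, d_true, max_iter=30, half=4):
    """Iterate equations (23) to (26) starting from d = 0."""
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1]
    pts = z + np.column_stack([xs.ravel(), ys.ravel()]).astype(float)
    I, _ = blob(pts, z)                       # first frame, centred on the feature
    a, b, c = line
    norm = np.hypot(a, b)
    p = np.array([a, b]) / norm               # equation (19)
    d = np.zeros(2)
    for _ in range(max_iter):
        # Second frame sampled at p_n + (sum of previous terms), equation (23).
        J, A = blob(pts + d, z + d_true)
        G, e = A.T @ A, A.T @ (I - J)         # equation (25), unit weights
        r = -(a * (z[0] + d[0]) + b * (z[1] + d[1]) + c) / norm   # equation (26)
        M = np.vstack([G, lam * p])
        rhs = np.append(e, lam * r)
        step = np.linalg.solve(M.T @ M, M.T @ rhs)                # equation (18)
        d = d + step                          # running sum of equation (21)
        if np.linalg.norm(step) < 1e-8:       # stop on the norm of the last term
            break
    return d

z = np.array([50.0, 40.0])
d_true = np.array([1.2, -0.4])
line = (0.0, 1.0, -39.6)                      # line y = 39.6, through z + d_true
d_klt = track(z, line, 0.0, d_true)           # lambda = 0: original KLT iteration
d_epi = track(z, line, 1.0, d_true)           # lambda = 1: with the epipolar term
print(d_klt, d_epi)
```

In this noiseless example the epipolar line passes exactly through the true position, so both runs are expected to reach the same displacement; the sketch is meant only to show where vector r of equation (26) enters each iteration.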

The process described so far adds information about the direction of the displacement vector to be computed, namely equation (15) produces displacement vectors which agree with the epipolar geometry. However, this may not be enough to correctly characterise the vector itself, since the constraint on the epipolar straight line leaves a degree of freedom. This degree of freedom derives from the fact that the second term in equation (15) takes the solution towards the epipolar straight line but not necessarily towards the real feature position in the second frame. It is possible to reduce such problem by introducing an allowability area of the epipolar straight line and therefore transforming it into a half line.

This check is based on the fact that, with reference to FIG. 2, it is possible to associate with feature A in the first frame IM₁ a point P, which can be found at an infinite distance in the taken scene. Therefore, a point B can be located on the epipolar straight line L, associated with the second frame IM₂, which is the projection of point P. Since any object of the scene which is projected on point A lies on half line AP, the projection of such object in the second frame must lie on half line S, which is the projection of half line AP. Such half line is therefore a subset of the epipolar straight line.

Taking into account the possibility of deviating from the epipolar straight line, due to the uncertain calibration reliability, the allowable subset can be extended to an allowability area Z containing half line S, with point B as vertex, as can be seen in FIG. 3.

Based on such observations, it is possible to carry out a check which allows efficiently removing the displacement vectors which do not comply with the scene geometry. The check, useful in cases in which point B is within the second frame IM₂, thereby allows removing all candidates {z,d} (feature and displacement vector) for which point z+d does not lie within the allowability area Z. Area Z can have different shapes. It is necessary that Z is a subset of the plane, delimited by a curve passing through point B or next to it, and which contains half line S.

Half line S can be expressed as follows:

$S = \{ B + K \cdot \vec{e} : K \geq 0 \} \qquad (27)$

where K is a non-negative factor and $\vec{e}$ is a directional vector parallel to the epipolar straight line.

The elements which are used for defining the half line S can be obtained through knowledge of the camera calibration parameters. Among them, the intrinsic parameters of the two cameras are contained in matrices K₁ and K₂, while the rotation matrix and the translation vector (extrinsic parameters) associated with the second camera are called R and t. For more details, refer to the text by Richard Hartley and Andrew Zisserman, Multiple View Geometry in Computer Vision, Cambridge University Press, 2000, on pages 153-156 and 244. Matrix R and vector t transform the system of coordinates belonging to the second camera into the one belonging to the first. Feature A in the first frame can be expressed in homogeneous coordinates:

$A = \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}.$

Point B (projection of the point P that resides at infinity) can be expressed through A and calibration parameters as:

$B = K_{2} \frac{Rw}{R_{3} w} \qquad (28)$

where R₃ denotes the third row of the rotation matrix R. Vector w is obtained from point A as

$w = \frac{K_{1}^{-1} A}{\left\| K_{1}^{-1} A \right\|} \qquad (29)$

The directional vector $\vec{e}$ can be expressed as

$\vec{e} = K_{2} \left( (R_{3} w)\, t - t_{3} R w \right) \qquad (30)$

where t₃ is the third element of the translation vector t.
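As a hedged illustration, equations (28) to (30) can be evaluated numerically as follows. The calibration values are invented for the example, and the sketch assumes that a scene point with coordinates X₁ in the first camera system has coordinates RX₁+t in the second one, which is the reading under which equations (28) and (30) describe the projection of half line AP; this convention is an assumption of the example.

```python
import numpy as np

def half_line(A, K1, K2, R, t):
    """Vertex B and direction e of half line S, equations (28) to (30)."""
    w = np.linalg.solve(K1, A)
    w = w / np.linalg.norm(w)                 # equation (29)
    Rw = R @ w
    B = K2 @ (Rw / Rw[2])                     # equation (28); R_3 w is Rw[2]
    e = K2 @ (Rw[2] * t - t[2] * Rw)          # equation (30)
    return B, e, w

# Hypothetical calibration: identical cameras, no rotation, 0.1 baseline along x.
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)
t = np.array([-0.1, 0.0, 0.0])
A = np.array([400.0, 300.0, 1.0])

B, e, w = half_line(A, K, K, R, t)

# Projection in the second frame of a scene point at finite distance on ray AP.
X2 = R @ (2.0 * w) + t
proj = K @ X2 / X2[2]
factor = (proj - B)[0] / e[0]                 # factor K of equation (27)
print(B, e, factor)
```

With parallel cameras the point at infinity projects onto the same pixel in both frames, so B coincides with A, and the finite-distance point falls on the half line with a positive factor.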

As an example, the allowability area can be defined in the following way, with reference to FIG. 3. The position of the point whose correctness has to be verified is given by the sum z+d. The point belongs to the allowability area if the half line starting from B and crossing z+d makes an angle with half line S which is less than a certain threshold. In this case, the membership check is expressed in vector terms by verifying the following relationship

$\frac{(z + d - B)^{T} \vec{e}}{\left\| z + d - B \right\| \left\| \vec{e} \right\|} \geq \theta \qquad (31)$

in which an inner product between normalised vectors must reach or exceed a threshold θ∈[0,1], to be set. The minimum threshold value is 0, in which case the allowability area is the half plane delimited by the straight line perpendicular to the epipolar straight line in point B and which contains half line S. The more the threshold value increases towards value 1, the more the allowability area gets restricted. FIG. 3 shows an allowability area corresponding to an intermediate threshold value between 0 and 1, and an example of a point z+d′ which is unacceptable according to the above criterion is shown.
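The check of equation (31) can be sketched, in a non-limiting way, as follows. The function name, the numerical values and the treatment of a point coinciding exactly with B (accepted here) are assumptions of this example.

```python
import numpy as np

def in_allowability_area(z, d, B, e, theta):
    """Check of equation (31) on the candidate point z + d."""
    v = np.asarray(z, float) + np.asarray(d, float) - np.asarray(B, float)
    e = np.asarray(e, float)
    denom = np.linalg.norm(v) * np.linalg.norm(e)
    if denom == 0.0:
        return True   # assumption: a point coinciding with B is accepted
    return bool(float(v @ e) / denom >= theta)

# Illustrative values: half line S starts at B and points towards negative x.
B = (100.0, 50.0)
e = (-1.0, 0.0)
z = (90.0, 50.0)

near_S = in_allowability_area(z, (-5.0, 1.0), B, e, theta=0.9)     # small angle with S
behind_B = in_allowability_area(z, (20.0, 0.0), B, e, theta=0.9)   # opposite side of B
oblique = in_allowability_area(z, (-5.0, 10.0), B, e, theta=0.9)   # angle too wide
half_plane = in_allowability_area(z, (-5.0, 10.0), B, e, theta=0.0)
print(near_S, behind_B, oblique, half_plane)
```

The last two calls use the same candidate point: it is rejected with θ=0.9 and accepted with θ=0, for which the allowability area is the whole half plane containing half line S.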

As output from the processing unit PU, as a result of the above processing, associations are obtained between features extracted from the first frame IM₁ and corresponding points found in the second frame IM₂, and therefore displacement vectors d linked to the points themselves.

Some examples can be observed in FIGS. 4-9. FIGS. 4 and 5 contain a pair of frames of the same scene taken from different angles, to which the algorithm composing the invention has been applied.

FIG. 6 shows the result of the KLT tracking technique performed in the original way, while FIG. 7 shows the result obtained through the technique of the present invention. Both figures point out as segments the displacement vectors of the features chosen for tracking. As can be observed, in FIG. 7 the segments are mutually oriented much more coherently, this being an index of a lower presence of errors.

FIGS. 8 and 9 show some significant details in which the benefits given by the invention are more evident. In each one of these latter images, the top left box (8 a, 9 a) is centered on the feature extracted from the first frame; the top right box (8 b, 9 b) is centered on the point in the second frame obtained through the original KLT technique; the bottom right box (8 c, 9 c) is centered on the point in the second frame obtained through the present invention. In both right boxes, the epipolar straight line associated with the relevant feature is shown, which allows appreciating the strong improvement obtained.

1-11. (canceled)
 12. A method for determining scattered disparity fields in stereo vision, comprising the steps of: capturing a first and a second image of a same scene from two different positions, so that at least one set of pixels of said first image is associated with a corresponding set of points on said second image; selecting at least one pixel in said first image; and determining a point of said second image associated with said pixel, wherein determining said point comprises minimising a cost function having as a variable a disparity vector between said pixel and said point and depending on a difference between the first and the second image in a window centered on said pixel and in a corresponding window centered on the point to be determined, said cost function comprising a term which depends in a monotonically increasing way on the distance of said point to be determined from an epipolar straight line in said second image associated with said pixel.
 13. The method according to claim 12, wherein said first and second images are captured through a first camera and a second camera, respectively, and said term depends on calibration parameters of said first and second cameras.
 14. The method according to claim 13, wherein said term is weighted depending on an uncertainty of camera calibration data.
 15. The method according to claim 12, wherein said term is proportional to the square of the distance between said point and the epipolar line.
 16. The method according to claim 12, wherein the step of determining the point of said second image associated with said pixel comprises delimiting in said second image an allowability area of said point around said epipolar straight line.
 17. The method according to claim 16, wherein said allowability area is contained in a half plane delimited by a straight line perpendicular to said epipolar straight line.
 18. The method according to claim 12, wherein said cost function comprises parameters comprised of brightness gradients in the pixels of said window centered on said pixel in said second image, and of brightness differences between pixels of said window centered on said pixel and pixels of said second image in the same window.
 19. The method according to claim 12, wherein the step of determining the point of said second image associated with said pixel comprises associating a relative weight to all pixels contained in said window centered on said pixel.
 20. The method according to claim 19, wherein the weight associated with pixels in a central area of said window centered on said pixel is greater than the weight associated with pixels in an external area of said window.
 21. The method according to claim 12, wherein the step of determining said point comprises said step of minimising the cost function.
 22. A system for stereo vision, comprising a first and a second shooting camera adapted to shoot a same scene and a processing unit adapted to receive from said first and second cameras respective simultaneous images of said scene, said processing unit capable of being configured for processing said respective images according to claim 12.